Improved Algorithms for Convex-Concave Minimax Optimization

Neural Information Processing Systems

This paper studies minimax optimization problems min_x max_y f(x,y), where f(x,y) is m_x-strongly convex with respect to x, m_y-strongly concave with respect to y, and (L_x, L_xy, L_y)-smooth. Zhang et al. [42] provided the following lower bound on the gradient complexity for any first-order method: Ω(√(L_x/m_x + L_xy²/(m_x·m_y) + L_y/m_y) · ln(1/ε)).
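
To make the problem class concrete, here is a minimal sketch, assuming a toy quadratic f(x, y) = (m_x/2)x² + L_xy·x·y − (m_y/2)y² and an illustrative step size eta; it runs plain simultaneous gradient descent-ascent, a baseline rather than the paper's improved algorithms:

    # A minimal sketch (not the paper's algorithm): simultaneous gradient
    # descent-ascent on a toy strongly-convex-strongly-concave quadratic,
    # illustrating the (m_x, m_y, L_xy) quantities the bound depends on.
    import numpy as np

    mx, my, Lxy = 1.0, 1.0, 2.0      # strong convexity/concavity, coupling

    def grad_x(x, y):
        return mx * x + Lxy * y      # df/dx

    def grad_y(x, y):
        return Lxy * x - my * y      # df/dy

    x, y, eta = 5.0, -3.0, 0.1       # illustrative start and step size
    for _ in range(200):
        x, y = x - eta * grad_x(x, y), y + eta * grad_y(x, y)

    print(x, y)  # both coordinates approach the saddle point (0, 0)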


Proximal Approximate Inference in State-Space Models

Abdulsamad, Hany, García-Fernández, Ángel F., Särkkä, Simo

arXiv.org Artificial Intelligence

We present a class of algorithms for state estimation in nonlinear, non-Gaussian state-space models. Our approach is based on a variational Lagrangian formulation that casts Bayesian inference as a sequence of entropic trust-region updates subject to dynamic constraints. This framework gives rise to a family of forward-backward algorithms, whose structure is determined by the chosen factorization of the variational posterior. By focusing on Gauss–Markov approximations, we derive recursive schemes with favorable computational complexity. For general nonlinear, non-Gaussian models we close the recursions using generalized statistical linear regression and Fourier–Hermite moment matching.
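
As an illustration of the statistical linear regression step used to close the recursions, here is a minimal Monte Carlo sketch (the paper uses generalized statistical linear regression and Fourier–Hermite moment matching; the nonlinear function f and the moments m, P below are made-up examples):

    # A minimal sketch of Monte Carlo statistical linear regression:
    # approximate a nonlinear f by an affine map A x + b with respect to
    # a Gaussian N(m, P), matching cross-covariance and mean.
    import numpy as np

    def statistical_linear_regression(f, m, P, n_samples=10_000, seed=0):
        rng = np.random.default_rng(seed)
        xs = rng.multivariate_normal(m, P, size=n_samples)  # x ~ N(m, P)
        ys = np.array([f(x) for x in xs])
        mu = ys.mean(axis=0)                                # E[f(x)]
        C = (xs - m).T @ (ys - mu) / n_samples              # Cov(x, f(x))
        A = np.linalg.solve(P, C).T                         # A = C^T P^{-1}
        b = mu - A @ m
        return A, b

    # Example: linearize a mildly nonlinear transition around N(m, P).
    f = lambda x: np.sin(x) + 0.5 * x
    m, P = np.zeros(2), 0.1 * np.eye(2)
    A, b = statistical_linear_regression(f, m, P)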


Understanding the Role of Momentum in Stochastic Gradient Methods

Igor Gitman, Hunter Lang, Pengchuan Zhang, Lin Xiao

Neural Information Processing Systems

The use of momentum in stochastic gradient methods has become a widespread practice in machine learning. Different variants of momentum, including heavy-ball momentum, Nesterov's accelerated gradient (NAG), and quasi-hyperbolic momentum (QHM), have demonstrated success on various tasks.
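
For reference, a minimal sketch of the QHM update rule, which contains plain SGD (nu = 0) and a normalized form of heavy-ball momentum (nu = 1) as special cases, with nu = beta approximately recovering NAG; the quadratic objective and hyperparameter values are illustrative, not from the paper:

    # A minimal sketch of the quasi-hyperbolic momentum (QHM) update:
    # nu = 0 gives plain SGD, nu = 1 a normalized heavy-ball update.
    import numpy as np

    def qhm_step(theta, g_buf, grad, alpha=0.1, beta=0.9, nu=0.7):
        g_buf = beta * g_buf + (1.0 - beta) * grad   # momentum buffer (EMA)
        theta = theta - alpha * ((1.0 - nu) * grad + nu * g_buf)
        return theta, g_buf

    grad_f = lambda th: 2.0 * th                     # gradient of ||theta||^2
    theta, g_buf = np.array([5.0, -3.0]), np.zeros(2)
    for _ in range(100):
        theta, g_buf = qhm_step(theta, g_buf, grad_f(theta))
    print(theta)  # approaches the minimizer at the origin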


A PCA-based Data Prediction Method

Daugulis, Peteris, Vagale, Vija, Mancini, Emiliano, Castiglione, Filippo

arXiv.org Artificial Intelligence

The problem of choosing appropriate values for missing data is often encountered in data science. We describe a novel method, combining traditional mathematics with machine learning elements, for the prediction (imputation) of missing data. The method is based on the notion of distance between shifted linear subspaces representing the existing data and candidate sets. The existing data set is represented by the subspace spanned by its first principal components. Solutions for the case of the Euclidean metric are given.
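
One way to read the construction, sketched under the assumption that the distance to a candidate set reduces to the Euclidean distance from the completed point to the affine span of the top-k principal components (the helper pca_impute_row and all values below are hypothetical, not the authors' exact algorithm):

    # A minimal sketch: fit the top-k principal subspace on complete rows,
    # then fill a row's missing entries so the completed point minimizes
    # Euclidean distance to that shifted (affine) subspace. Closed form:
    # with P = I - V V^T, the minimizer solves P_mm w_m = -P_mo w_o.
    import numpy as np

    def pca_impute_row(X_complete, row, k=2):
        mu = X_complete.mean(axis=0)
        _, _, Vt = np.linalg.svd(X_complete - mu, full_matrices=False)
        V = Vt[:k].T                      # d x k top principal directions
        P = np.eye(len(mu)) - V @ V.T     # projector onto the complement
        miss = np.isnan(row)
        w = row - mu
        # minimize (w^T P w) over the missing coordinates of w
        w_m = np.linalg.solve(P[np.ix_(miss, miss)],
                              -P[np.ix_(miss, ~miss)] @ w[~miss])
        filled = row.copy()
        filled[miss] = mu[miss] + w_m
        return filled

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.1])  # near-planar
    row = X[0].copy(); row[2] = np.nan
    print(pca_impute_row(X[1:], row, k=2))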

